
    Cloud-Based Speech Technology for Assistive Technology Applications (CloudCAST)

    The CloudCAST platform provides a series of speech recognition services that can be integrated into assistive technology applications. The platform and the services provided by the public API are described. Several exemplar applications have been developed to demonstrate the platform to potential developers and users.

    Characterisation of voice quality of Parkinson’s disease using differential phonological posterior features

    Change in voice quality (VQ) is one of the first precursors of Parkinson’s disease (PD). Specifically, impaired phonation and articulation cause the patient to have a breathy, husky, semi-whispered and hoarse voice. A goal of this paper is to characterise a VQ spectrum – the composition of non-modal phonations – of voice in PD. The paper relates the non-modal healthy phonations (breathy, creaky, tense, falsetto and harsh) to disordered phonation in PD. First, statistics are learned to differentiate the modal and non-modal phonations. These statistics are computed from phonological posteriors, the probabilities of phonological features inferred from the speech signal using a deep learning approach. Second, statistics of disordered speech are learned from PD speech data comprising 50 patients and 50 healthy controls. Third, the Euclidean distance is used to measure the similarity between the non-modal and disordered statistics, and the inverse of the distances is used to obtain the composition of non-modal phonation in PD. Thus, pathological voice quality is characterised using a healthy non-modal voice quality “base/eigenspace”. The obtained results can be interpreted as the voice of an average patient with PD, characterised by a voice quality spectrum composed of 30% breathy voice, 23% creaky voice, 20% tense voice, 15% falsetto voice and 12% harsh voice. In addition, the proposed features were applied to prediction of the dysarthria level according to the Frenchay assessment score related to the larynx, and a significant improvement is obtained for the reading speech task. The proposed characterisation of VQ might also be applied to other kinds of pathological speech.
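    The composition step can be illustrated with a short sketch. The vectors below are invented placeholders standing in for the learned phonological-posterior statistics (in the paper these come from a deep-learning phonological analyser over the 50 patients and 50 controls); the sketch only shows how inverse Euclidean distances are normalised into a voice-quality composition, not the authors' actual pipeline.

```python
import numpy as np

# Hypothetical mean phonological-posterior statistics, one vector per
# non-modal phonation type (placeholders, not values from the paper).
non_modal_stats = {
    "breathy":  np.array([0.62, 0.10, 0.25, 0.40]),
    "creaky":   np.array([0.15, 0.55, 0.30, 0.20]),
    "tense":    np.array([0.20, 0.35, 0.60, 0.10]),
    "falsetto": np.array([0.45, 0.05, 0.15, 0.70]),
    "harsh":    np.array([0.10, 0.50, 0.55, 0.30]),
}

# Hypothetical statistics of disordered (PD) speech, learned the same way.
pd_stats = np.array([0.50, 0.30, 0.35, 0.35])

# Euclidean distance between PD statistics and each non-modal phonation,
# then invert and normalise to obtain a voice-quality "composition".
dists = {k: np.linalg.norm(pd_stats - v) for k, v in non_modal_stats.items()}
inv = {k: 1.0 / d for k, d in dists.items()}
total = sum(inv.values())
composition = {k: 100.0 * v / total for k, v in inv.items()}

for phonation, share in sorted(composition.items(), key=lambda x: -x[1]):
    print(f"{phonation:9s} {share:5.1f}%")
```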

    NeuroSpeech

    NeuroSpeech is software for modeling pathological speech signals along different speech dimensions: phonation, articulation, prosody, and intelligibility. Although it was developed to model dysarthric speech signals from Parkinson's patients, its structure allows other computer scientists or developers to include other pathologies and/or measures. Different tasks can be performed: (1) modeling of the signals along the aforementioned speech dimensions, (2) automatic discrimination of Parkinson's vs. non-Parkinson's speech, and (3) prediction of the neurological state according to the Unified Parkinson's Disease Rating Scale (UPDRS) score. Prediction of the dysarthria level according to the Frenchay Dysarthria Assessment scale is also provided.
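    As a rough illustration of tasks (2) and (3), the following sketch runs a generic classifier and regressor over a placeholder feature matrix with scikit-learn. It does not reproduce NeuroSpeech's actual feature extraction or models; the feature matrix, class labels, and UPDRS scores are synthetic stand-ins.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, SVR

rng = np.random.default_rng(0)

# Placeholder feature matrix: one row per speaker; in a real system the
# columns would be phonation/articulation/prosody/intelligibility descriptors.
X = rng.normal(size=(100, 24))
y_class = np.repeat([0, 1], 50)              # 0 = healthy control, 1 = PD
y_updrs = rng.uniform(0, 100, size=100)      # placeholder UPDRS scores

# Task (2): automatic discrimination of Parkinson's vs. non-Parkinson's.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
acc = cross_val_score(clf, X, y_class, cv=5, scoring="accuracy")
print("classification accuracy:", acc.mean())

# Task (3): prediction of the neurological state (UPDRS score).
reg = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
mae = -cross_val_score(reg, X, y_updrs, cv=5, scoring="neg_mean_absolute_error")
print("UPDRS mean absolute error:", mae.mean())
```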

    A cross-lingual adaptation approach for rapid development of speech recognizers for learning disabled users

    Building a voice-operated system for learning disabled users is a difficult task that requires a considerable amount of time and effort. Due to the wide spectrum of disabilities and their different related phonopathies, most available approaches are targeted to a specific pathology. This may improve their accuracy for some users, but makes them unsuitable for others. In this paper, we present a cross-lingual approach to adapt a general-purpose modular speech recognizer for learning disabled people. The main advantage of this approach is that it allows rapid and cost-effective development by taking the already built speech recognition engine and its modules, and utilizing existing resources for standard speech in different languages for the recognition of the users’ atypical voices. Although the recognizers built with the proposed technique obtain lower accuracy rates than those trained for specific pathologies, they can be used by a wide population and developed more rapidly, which makes it possible to design various types of speech-based applications accessible to learning disabled users. This research was supported by the project ‘Favoreciendo la vida autónoma de discapacitados intelectuales con problemas de comunicación oral mediante interfaces personalizados de reconocimiento automático del habla’, financed by the Centre of Initiatives for Development Cooperation (Centro de Iniciativas de Cooperación al Desarrollo, CICODE), University of Granada, Spain, and by the Student Grant Scheme 2014 (SGS) at the Technical University of Liberec.
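    The abstract does not detail how the existing resources are reused, so the fragment below only illustrates one common ingredient of cross-lingual adaptation: mapping each target phone onto the closest phone for which a source-language acoustic model already exists. All model names and distances are hypothetical placeholders, not part of the described system.

```python
# Hypothetical illustration of cross-lingual reuse: borrow, for each target
# phone, the acoustic model of the closest phone available in some source
# language. Distances here are invented; a real system would derive them
# from phonetic features or data-driven similarity measures.
existing_models = {
    ("es", "a"): "model_es_a",
    ("en", "ae"): "model_en_ae",
    ("cs", "r"): "model_cs_r",
}

# Invented phonetic distances between target phones and available phones.
distance = {
    ("a", ("es", "a")): 0.0,
    ("a", ("en", "ae")): 0.4,
    ("rr", ("cs", "r")): 0.2,
    ("rr", ("es", "a")): 0.9,
}

def borrow_model(target_phone):
    """Pick the acoustic model of the closest available source-language phone."""
    candidates = [(distance.get((target_phone, key), float("inf")), key)
                  for key in existing_models]
    best_dist, best_key = min(candidates)
    return existing_models[best_key], best_dist

for phone in ["a", "rr"]:
    model, d = borrow_model(phone)
    print(f"target phone /{phone}/ -> {model} (distance {d})")
```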

    Multi-view representation learning via GCCA for multimodal analysis of Parkinson's disease

    Information from different bio-signals such as speech, handwriting, and gait has been used to monitor the state of Parkinson's disease (PD) patients; however, all of these multimodal bio-signals may not always be available. We propose a method based on multi-view representation learning via generalized canonical correlation analysis (GCCA) for learning a representation of features extracted from handwriting and gait that can be used as a complement to speech-based features. Three different problems are addressed: classification of PD patients vs. healthy controls, prediction of the neurological state of PD patients according to the UPDRS score, and prediction of a modified version of the Frenchay dysarthria assessment (m-FDA). According to the results, the proposed approach improves the results on the addressed problems, especially in the prediction of the UPDRS and m-FDA scores.
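    For readers unfamiliar with GCCA, here is a compact NumPy sketch of the MAXVAR formulation: a shared representation G is taken from the top eigenvectors of the sum of per-view projection matrices, and each view gets a linear mapping towards G. It is not the authors' implementation; the dimensions, regularisation, and toy data are assumptions.

```python
import numpy as np

def gcca(views, k=2, reg=1e-3):
    """MAXVAR-style GCCA: return a shared representation G (n x k) and
    per-view projection matrices W_i mapping each view close to G.
    `views` is a list of (n x d_i) arrays with the same number of rows."""
    n = views[0].shape[0]
    M = np.zeros((n, n))
    inv_parts = []
    for X in views:
        Xc = X - X.mean(axis=0)                      # centre each view
        C = Xc.T @ Xc + reg * np.eye(X.shape[1])     # regularised scatter matrix
        inv_part = np.linalg.solve(C, Xc.T)          # (X^T X + rI)^-1 X^T
        inv_parts.append(inv_part)
        M += Xc @ inv_part                           # projection onto the view's column space
    # Shared representation: top-k eigenvectors of the summed projections.
    eigvals, eigvecs = np.linalg.eigh(M)
    G = eigvecs[:, -k:]
    # Per-view mappings so that X_i @ W_i approximates G.
    Ws = [inv_part @ G for inv_part in inv_parts]
    return G, Ws

# Toy example with three hypothetical views (speech, handwriting, gait features).
rng = np.random.default_rng(0)
n = 40
speech, handwriting, gait = (rng.normal(size=(n, d)) for d in (10, 8, 6))
G, (W_s, W_h, W_g) = gcca([speech, handwriting, gait], k=3)
print(G.shape, W_s.shape, W_h.shape, W_g.shape)   # (40, 3) (10, 3) (8, 3) (6, 3)
```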

    INTERSPEECH 2013, Fourth Workshop on Speech and Language Processing for Assistive Technologies (SLPAT2013)

    We are pleased to bring you the Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies (SLPAT), held in Grenoble, France on the 21st and 22nd of August, 2013. We received 23 paper submissions, of which 12 were chosen for oral presentation and another 5 for poster presentation. In addition, two demo proposals were accepted. All 19 papers are included in this volume. This workshop was intended to bring together researchers from all areas of speech and language technology with a common interest in making everyday life more accessible for people with physical, cognitive, sensory, emotional or developmental disabilities. This workshop builds on three previous such workshops (co-located with NAACL HLT 2010, EMNLP in 2011, and NAACL HLT 2012) and includes a special topic, "Speech Interaction Technology for Ambient Assisted Living in the Home", which is a follow-up of two events (ILADI 2012, co-located with JEP-TALN-RECITAL 2012, and a special session at EUSIPCO 2012). The workshop provides an opportunity for individuals from research communities, and the individuals with whom they are working, to share research findings, and to discuss present and future challenges and the potential for collaboration and progress. While Augmentative and Alternative Communication (AAC) is a particularly apt application area for speech and Natural Language Processing (NLP) technologies, we purposefully made the scope of the workshop broad enough to include assistive technologies (AT) as a whole, even those falling outside of AAC. While we encouraged work that validates methods with human experimental trials, we also accepted work on basic-level innovations and philosophy, inspired by AT/AAC related problems. Thus we have aimed at broad inclusivity, which is also manifest in the diversity of our Program Committee. We are delighted to have Prof. Mark Hawley from the University of Sheffield as the invited speaker. In addition, we continue our tradition of a panel of AAC users, who will speak on their experiences and perspectives as users of AAC technology. Finally, this year we also have a tour of the DOMUS "smart home" of the Laboratoire d'Informatique de Grenoble. Because of the large number of submissions and program items, we have for the first time extended the workshop to two full days. We would like to thank all the people and institutions who contributed to the success of the SLPAT 2013 workshop: the authors, the members of the program committee, the members of the organising committee and the invited speaker Mark Hawley. Finally, we would like to thank the Universities of Grenoble for sponsoring and hosting the workshop in the Laboratoire d'Informatique de Grenoble premises. Jan Alexandersson, Peter Ljunglöf, Kathleen F. McCoy, François Portet, Brian Roark, Frank Rudzicz and Michel Vacher, Co-organizers of SLPAT 2013